MongoDB to SQL Migration Plan
Moving our existing customer databases to a SQL-based DB is already settled; the open question is which SQL database. With Postgres, MySQL, MariaDB, Oracle, Microsoft SQL Server, and SQLite all on the table, we need clear criteria for the decision.
Decision Criteria
1. Price
Enterprise DBs are absurdly expensive. The target is something cheaper than MongoDB that still comes out at a reasonable price once AWS costs are factored in.
https://aws.amazon.com/rds/pricing/
DB Type | Instance Class | Price Per Hour
---|---|---
RDS for Oracle License Included (LI) Single-AZ | db.t3.small | $0.088
RDS for Oracle License Included (LI) Multi-AZ | db.t3.small | $0.176
RDS for Oracle BYOL (Bring Your Own License) Single-AZ Deployment | db.t3.small | $0.052
RDS for Oracle BYOL (Bring Your Own License) Multi-AZ Deployment | db.t3.small | $0.104
RDS for MySQL Single-AZ Deployment | db.t4g.micro | $0.025
RDS for MySQL Single-AZ Deployment | db.t4g.small | $0.051
RDS for MySQL Multi-AZ Deployment | db.t4g.micro | $0.051
RDS for MySQL Multi-AZ Deployment | db.t4g.small | $0.102
Amazon RDS for PostgreSQL Single-AZ Deployment | db.t4g.micro | $0.025
Amazon RDS for PostgreSQL Single-AZ Deployment | db.t4g.small | $0.051
Amazon RDS for PostgreSQL Multi-AZ Deployment | db.t4g.micro | $0.051
Amazon RDS for PostgreSQL Multi-AZ Deployment | db.t4g.small | $0.102
RDS for MariaDB Single-AZ Deployment | db.t4g.micro | $0.025
RDS for MariaDB Single-AZ Deployment | db.t4g.small | $0.051
RDS for MariaDB Multi-AZ Deployment | db.t4g.micro | $0.051
RDS for MariaDB Multi-AZ Deployment | db.t4g.small | $0.102
RDS for SQL Server On-Demand Instances | db.t3.micro | $0.031
RDS for SQL Server On-Demand Instances | db.t3.small | $0.062
AWS Aurora is a high-performance database service compatible with MySQL and PostgreSQL. It comes in two billing variants: Aurora Standard, which charges for I/O, and Aurora I/O-Optimized, which charges for storage capacity instead.
Aurora also offers several capacity options: Serverless, Provisioned On-Demand Instance, Provisioned Reserved Instance, and Limitless.
The Aurora Serverless option scales automatically, as serverless implies: it shuts itself down when there are no requests and allocates resources horizontally when a burst of transactions arrives. Because it is not always on like the Provisioned options, usage is billed in a dedicated unit called the ACU (Aurora Capacity Unit), metered with per-second precision.
Measure | Aurora Standard (per ACU hour) | Aurora I/O-Optimized (per ACU hour)
---|---|---
Aurora Capacity Unit | $0.20 | $0.26
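As a rough worked example (assuming a modest, bursty workload, not a measurement): a Serverless workload averaging 2 ACUs for 12 hours a day on Aurora Standard would run about 2 × 12 × 30 × $0.20 ≈ $144/month, while the same capacity running 24/7 would be about $288/month. The gap is what the scale-to-zero behavior is buying.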
If serverless is not appealing, there are the Provisioned On-Demand Instance and Provisioned Reserved Instance options. The former pays for an ordinary RDS instance month by month; a Reserved Instance pays for several months or years of usage upfront.
Standard Instances - Current Generation | Aurora Standard (Price Per Hour) | Aurora I/O-Optimized (Price Per Hour)
---|---|---
db.t4g.medium | $0.113 | $0.147
db.t4g.large | $0.227 | $0.295
db.t3.medium | $0.125 | $0.163
db.t3.large | $0.25 | $0.325
2. Technical Support
Pay enough (i.e., buy an enterprise plan) and you get technical and after-sales support. PostgreSQL also reportedly offers technical support for enterprise customers through commercial vendors; compared with raw open source, the firmer technical guarantees make it worth considering in a critical industry where money changes hands.
3. Performance
We're coming over from NoSQL anyway... whichever SQL DB we pick, a performance improvement is all but guaranteed for our workload, so this criterion barely discriminates between the options.
4. ORDB
PostgreSQL is what complicates this. It isn't a plain RDBMS but an ORDBMS, so the notion of an "object" comes into play: a schema can have functions attached to it, which makes it an object through and through. I'm not yet sure what differentiation that buys us, and we still have to work out whether our domain needs an ORDB at all.
5. Types
Do we use types like Enum, UUID, Decimal, JSON, or Array in our service? If so, the DB that supports the widest range of these types wins; otherwise we will have to bolt conversion logic onto the migration.
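As a sketch of what that conversion logic could look like (Node/TypeScript with the mongodb driver; the document shape and field names are illustrative assumptions, not our actual schema):

import { ObjectId, Decimal128 } from "mongodb";
import { randomUUID } from "node:crypto";

// Illustrative document shape; not our actual schema.
interface MongoUserDoc {
  _id: ObjectId;
  balance: Decimal128;
  roles: string[];
}

// Convert one document into a row shape that SQL column types can accept.
function toSqlRow(doc: MongoUserDoc) {
  return {
    id: randomUUID(),                 // fresh UUID primary key
    mongoId: doc._id.toHexString(),   // keep the original ObjectId for traceability
    balance: doc.balance.toString(),  // Decimal128 -> string, cast to NUMERIC on insert
    roles: doc.roles,                 // maps directly onto a PostgreSQL text[] column
  };
}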
Edited 2025-03-22
Preference is tilting toward PostgreSQL. The biggest factor is the JSON type: value objects can live in a column on the row itself instead of a separate table, and there is syntax for querying into JSON properties (though that will probably require raw queries), so it should buy the migration a substantial head start. We can't just migrate blindly, though: we still have to figure out how to turn MongoDB's ObjectID into PK fields, and we have to rebuild the relationship graph between tables whose relations have been severed.
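For reference, a minimal sketch of such a raw query through Prisma (the table, column, and JSON property names are assumptions for illustration, not our schema):

import { PrismaClient, Prisma } from "@prisma/client";

const prisma = new PrismaClient();

// Query into a JSONB column with PostgreSQL's ->> operator via a raw query.
async function findAcademiesByCity(city: string) {
  return prisma.$queryRaw(
    Prisma.sql`SELECT id, name FROM academies WHERE address->>'city' = ${city}`,
  );
}

Prisma also exposes native Json filters (path/equals) on PostgreSQL, so raw SQL may only be needed for the more exotic JSON operators.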
AI Response :: Claude 3.7 Sonnet
Based on your existing backend system built with NestJS, MongoDB, and Prisma ORM, I'll outline a comprehensive strategy for migrating to a SQL-based database. From your requirements and the notes in the paste.txt document, PostgreSQL appears to be the preferred option, so I'll focus primarily on that while highlighting considerations for other SQL databases.
1. PostgreSQL as the Primary Target
PostgreSQL offers several advantages that align with your needs:
- JSON Support: PostgreSQL's robust JSONB type can help manage complex nested data structures currently in MongoDB, enabling easier migration of value objects
- Type System: Rich type system including arrays, UUIDs, enums, and custom types
- Indexing Capabilities: Comprehensive indexing including partial and expression indexes
- Cost-Effective: Comparable pricing to other open-source SQL options on AWS RDS
- Object-Relational Features: ORDBMS capabilities provide flexibility for complex domain modeling
2. Migration Strategy Overview
Phase 1: Schema Design & Preparation
- Map Document Schema to Relational Schema
  - Convert MongoDB collections to SQL tables
  - Define explicit relations between entities
  - Design primary and foreign key relationships
- Handle ObjectId Migration
  - Option 1: Convert ObjectIds to UUIDs (preferred for PostgreSQL)

    -- PostgreSQL schema example
    CREATE TABLE users (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      username VARCHAR(255) NOT NULL,
      /* other fields */
    );

  - Option 2: Preserve original ObjectId strings in a character field

    -- For compatibility during transition
    CREATE TABLE users (
      id VARCHAR(24) PRIMARY KEY,
      mongo_id VARCHAR(24) UNIQUE, -- Store original during migration
      /* other fields */
    );
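  Whichever option is used, every cross-collection reference has to be rewritten through the same mapping. A minimal sketch of that step (Node/TypeScript; names are illustrative assumptions):

    import { randomUUID } from "node:crypto";

    // Allocate a UUID per ObjectId exactly once, so the same map can later
    // rewrite both primary keys and references consistently.
    const idMap = new Map<string, string>();

    function mapObjectId(objectIdHex: string): string {
      let uuid = idMap.get(objectIdHex);
      if (uuid === undefined) {
        uuid = randomUUID();
        idMap.set(objectIdHex, uuid);
      }
      return uuid;
    }

    // Later, for a referencing row (illustrative field names):
    // courtRow.academy_id = mapObjectId(mongoCourt.academy.toHexString());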
- JSON Field Planning
  - Map embedded MongoDB documents to PostgreSQL JSONB fields
  - Example for the address object in the user model:

    CREATE TABLE users (
      id UUID PRIMARY KEY,
      /* other fields */
      address JSONB -- Store complex address object
    );
- Schema Validation
  - Create a test schema and validate with sample data
  - Confirm all domain requirements are fulfilled by the relational design
Phase 2: Implementation & Migration
- Update Prisma Schema
  - Modify your Prisma schema to work with PostgreSQL
  - Add explicit relations and constraints
  - Example:

    model User {
      id        String   @id @default(uuid())
      name      String
      email     String   @unique
      tickets   Ticket[]
      createdAt DateTime @default(now())
      updatedAt DateTime @updatedAt
      address   Json?    // For complex MongoDB documents
    }
- Database Migration Tooling
  - Use a combination of custom scripts and tools like MongoDB Compass for export
  - Consider leveraging AWS Database Migration Service if using AWS infrastructure
  - For complex migrations, implement a custom ETL pipeline
- Data Transformation Scripts
  - Create scripts to handle data transformation between document and relational models
  - Transform ObjectIds to UUIDs/strings
  - Resolve embedded documents to related tables
  - Handle arrays and complex MongoDB-specific structures
- Incremental Migration Approach
  - Migrate individual collections/modules separately
  - Start with less complex, less critical modules
  - Implement a dual-write strategy during transition (sketched just below)
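  A minimal dual-write sketch, assuming separately generated Prisma clients for the two stores and illustrative model names (how failures are handled, failing the request versus queuing for backfill, is a project decision):

    // Minimal structural type for whichever Prisma client is passed in.
    type UserWriter = {
      user: { create(args: { data: Record<string, unknown> }): Promise<{ id: string }> };
    };

    // MongoDB stays the source of truth during transition; the SQL write is
    // best-effort and reconciled asynchronously if it fails.
    async function createUserDualWrite(
      mongoDb: UserWriter,
      sqlDb: UserWriter,
      input: { name: string; email: string },
    ) {
      const mongoUser = await mongoDb.user.create({ data: input });
      try {
        await sqlDb.user.create({ data: { ...input, mongoId: mongoUser.id } });
      } catch (err) {
        // Do not fail the request on the secondary write; log for backfill.
        console.error("SQL dual-write failed, queuing for reconciliation", err);
      }
      return mongoUser;
    }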
- API & Service Layer Updates
  - Modify service classes to handle SQL-specific operations
  - Update repository patterns/DAOs to work with relational data
  - Keep service interfaces consistent for client compatibility (see the interface sketch below)
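  One way to keep those interfaces stable during the cutover, sketched with illustrative names rather than the project's actual repository layer:

    // The service depends only on this interface, so Mongo-backed and
    // SQL-backed implementations can be swapped per module mid-migration.
    interface Ticket {
      id: string;
      userId: string;
      status: string;
    }

    interface TicketRepository {
      findByUser(userId: string): Promise<Ticket[]>;
    }

    class SqlTicketRepository implements TicketRepository {
      constructor(
        private readonly prisma: { ticket: { findMany(args: object): Promise<Ticket[]> } },
      ) {}

      findByUser(userId: string): Promise<Ticket[]> {
        return this.prisma.ticket.findMany({ where: { userId } });
      }
    }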
Phase 3: Verification & Optimization
- Testing Strategy
  - Create comprehensive test suites for migrated functionality
  - Implement parallel testing against both databases
  - Performance testing to validate SQL performance gains
- Indexing Optimization
  - Identify high-volume queries from the current system
  - Implement appropriate indexes for these queries
  - Example:

    CREATE INDEX idx_user_username ON users(username);
    CREATE INDEX idx_academy_owner ON academies(owner_id);
    CREATE INDEX idx_ticket_plan_type ON ticket_plans(type);
- Query Optimization
  - Analyze and optimize complex queries
  - Consider denormalization for performance-critical paths
  - Implement caching strategies where appropriate
3. Specific Migration Challenges & Solutions
Complex Document Structures
Challenge: MongoDB's flexible schema allows deeply nested documents and arrays.
Solution:
- Use PostgreSQL's JSONB for complex value objects that don't need to be queried often
- Normalize important nested structures into related tables
- For arrays, consider junction tables or PostgreSQL array types based on use case
Example conversion:
// MongoDB document
{
  _id: ObjectId("..."),
  name: "Tennis Academy",
  address: {
    detail: "123 Main St",
    location: {
      coordinates: [127.1, 37.5],
      type: "Point"
    },
    oldAddress: "Old address",
    roadAddress: "Road address"
  },
  images: ["url1", "url2"]
}
PostgreSQL approach:
CREATE TABLE academies (
  id UUID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  address JSONB, -- Store complex address object as JSONB
  images TEXT[]  -- Use PostgreSQL array for simple string arrays
);
ObjectId References
Challenge: MongoDB uses ObjectId references between collections.
Solution:
- Create explicit foreign key relationships
- Use UUIDs for new records
- Map original ObjectIds to new primary keys during migration
CREATE TABLE academies (
  id UUID PRIMARY KEY,
  -- other fields
);

CREATE TABLE courts (
  id UUID PRIMARY KEY,
  academy_id UUID NOT NULL REFERENCES academies(id),
  -- other fields
);
Handling Transactions
Challenge: Moving from MongoDB's document-level atomicity to SQL transactions.
Solution:
- Use PostgreSQL's transaction capabilities for multi-table operations
- Update service layer to use transactions for operations that modify multiple entities
- Example in NestJS with Prisma:
async createAcademyWithCourts(data: CreateAcademyDto): Promise<Academy> {
  return this.prisma.$transaction(async (tx) => {
    const academy = await tx.academy.create({
      data: {
        name: data.name,
        // other fields
      },
    });

    for (const court of data.courts) {
      await tx.court.create({
        data: {
          name: court.name,
          academyId: academy.id,
          // other fields
        },
      });
    }

    return academy;
  });
}
4. Technical Implementation Details
Schema Conversion Examples
Here are specific schema conversion examples based on your Prisma models:
MongoDB Academy Model:
model Academy {
  id      String @id @default(auto()) @map("_id") @db.ObjectId
  name    String
  owner   String @db.ObjectId
  // other fields
  address AcademymodelsAddress
  // relations
  Tag        AcademyTag[]
  TicketPlan TicketPlan[]
}
PostgreSQL Conversion:
model Academy {
  id    String @id @default(uuid())
  name  String
  owner String @db.Uuid
  // other fields
  address Json // Store as JSONB in PostgreSQL
  // explicit relations
  tags        AcademyTag[]
  ticketPlans TicketPlan[]

  @@index([name])
  @@index([owner])
}
NestJS Service Layer Updates
Your NestJS services will need modifications to handle SQL-specific operations:
// Before (MongoDB)
async findUserTickets(userId: string): Promise<Ticket[]> {
  return this.prisma.ticket.findMany({
    where: {
      user: userId, // ObjectId reference
    },
  });
}

// After (PostgreSQL)
async findUserTickets(userId: string): Promise<Ticket[]> {
  return this.prisma.ticket.findMany({
    where: {
      userId: userId, // Explicit foreign key
    },
    include: {
      ticketPlan: true, // Explicitly include relations
    },
  });
}
Indexing Strategy
Based on your schema and likely query patterns:
- Primary Indexes:
  - All primary keys and foreign keys

- Performance Indexes:
  - User lookups: username, phone, email
  - Academy lookups: name, owner, location (PostgreSQL GiST index for geospatial)
  - Ticket queries: user, status, expireTime
  - Payment queries: orderId, status, createdTime

- Composite Indexes:
  - (academy, court, from, to) for reservation availability checks
  - (user, ticketPlan, isExpired) for active ticket queries
-- Example indexes for PostgreSQL
CREATE INDEX idx_ticket_user_status ON tickets(user_id, status);
CREATE INDEX idx_reservation_time ON reservations(court_id, start_time, end_time);
-- Geospatial index on JSONB coordinates (requires the PostGIS extension)
CREATE INDEX idx_academy_location ON academies
  USING GIST (ST_SetSRID(ST_MakePoint((address->>'longitude')::float, (address->>'latitude')::float), 4326));
5. PostgreSQL-Specific Advantages
- JSONB Operations:
  - Query inside JSON structures:

    SELECT * FROM academies WHERE address->'location'->'coordinates'->>0 > '127.0';

  - Index JSON properties:

    CREATE INDEX idx_academy_city ON academies ((address->>'city'));
- Array Operations:
  - Query array elements:

    SELECT * FROM courts WHERE 'basketball' = ANY(sports);
- Custom Types and Functions:
  - Create domain-specific types:

    CREATE TYPE reservation_status AS ENUM ('pending', 'confirmed', 'cancelled');

  - Implement business logic in database functions:

    CREATE FUNCTION check_reservation_availability(court_id UUID, start_time TIMESTAMP, end_time TIMESTAMP)
    RETURNS BOOLEAN AS $$
      -- Implementation
    $$ LANGUAGE plpgsql;
6. Migration Timeline and Milestones
- Preparation Phase (3-4 weeks)
  - Schema design and mapping
  - Test environment setup
  - Migration script development

- Testing Phase (2-3 weeks)
  - Data migration verification
  - Performance testing
  - API compatibility testing

- Incremental Migration (6-10 weeks)
  - Migrate modules in order: Users → Academies → Courts → Ticket Plans → Tickets → Reservations → Payments
  - Implement dual-write for critical paths
  - Incremental cutover by module

- Optimization Phase (2-4 weeks)
  - Performance tuning
  - Index optimization
  - Query refinement

- Monitoring and Verification (2 weeks)
  - Extended monitoring period
  - Performance baseline establishment
  - Final optimization adjustments
7. Cost Considerations
Based on your workload:
- PostgreSQL on RDS:
  - Start with a db.t4g.small ($0.051/hour) for development
  - Scale to db.t4g.medium/large for production
  - Multi-AZ for the production environment ($0.102/hour)

- PostgreSQL Aurora:
  - Consider Serverless v2 for variable workloads
  - Set minimum and maximum ACU based on typical load patterns
  - Estimated cost: $0.20-0.26 per ACU hour

- Migration Costs:
  - Development time for migration scripts
  - Potential dual-running costs during the transition
  - AWS DMS costs if using that service
Conclusion
PostgreSQL offers the best balance of features, performance, and cost-effectiveness for your migration from MongoDB. The JSONB type provides a bridge between document and relational models, making the migration path smoother while giving you all the benefits of a robust relational database.
The migration strategy outlined above prioritizes:
- Data integrity and consistency
- Minimal disruption to services
- Incremental approach to reduce risk
- Performance optimization
- Leveraging PostgreSQL's unique features to maintain MongoDB-like flexibility where needed
By following this approach, you can successfully transition from MongoDB to PostgreSQL while improving your system's performance, maintainability, and cost-effectiveness.
AI Response :: o3-mini-high
Below is a comprehensive plan for migrating from MongoDB to a SQL-based system using Prisma ORM. This plan assumes you already have a well-structured Prisma schema for MongoDB and outlines the steps from schema conversion to data migration and verification.
1. Assess and Redesign the Data Model
- Review Your Existing Schema:
  Since your MongoDB schema is structured via Prisma, start by reviewing your current models. Identify:
  - Collections (models) and their fields.
  - Data types that differ between MongoDB and SQL (for example, MongoDB’s ObjectID vs. SQL’s UUID or auto-incrementing integer).
  - Any embedded documents or arrays that might require normalization in a relational model.

- Redesign for Relational Constraints:
  - Convert embedded documents into separate tables if needed.
  - Define explicit relationships (one-to-many, many-to-many) using foreign keys.
  - Adjust indexes and constraints (unique, not null, etc.) to suit a relational engine.
2. Update the Prisma Schema
- Change the Provider:
  In your schema.prisma file, switch the provider from "mongodb" to your chosen SQL provider (e.g., "postgresql", "mysql", or "sqlite"). For example:

    datasource db {
      provider = "postgresql" // or "mysql" / "sqlite"
      url      = env("DATABASE_URL")
    }
- Adapt Model Definitions:
  - Update field types where necessary (for example, converting String @id @default(auto()) from MongoDB’s ObjectID to an appropriate SQL equivalent, such as @id @default(autoincrement()) for integer keys or @id @default(uuid()) for UUIDs).
  - Define relation fields using Prisma’s relational syntax.
- Regenerate the Prisma Client:
  Once the schema is updated, run:

    npx prisma generate

  This will ensure your client is in sync with the updated schema.
3. Set Up the SQL Database Environment
- Provision a New SQL Instance:
  Set up your SQL database (e.g., PostgreSQL or MySQL) and update your environment variables (DATABASE_URL) accordingly.

- Run Prisma Migrations:
  Use Prisma Migrate to create and apply migration files that build the new SQL schema:

    npx prisma migrate dev --name init

  This creates the SQL schema in your new database.
4. Plan and Execute Data Migration
- Data Extraction:
  Write a script (using Node.js and the Prisma Client) that connects to your MongoDB instance, reads the data, and prepares it for insertion into the SQL database.
Data Transformation:
- Convert data types (for example, format dates, convert MongoDB ObjectIDs, and flatten nested documents if necessary).
- Handle relational data by ensuring that foreign key relationships are correctly established. This might involve a two-phase approach: first inserting “parent” records, then “child” records.
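  A minimal sketch of that two-phase ordering (the model and field names are illustrative assumptions, and the clients are typed loosely because this is a sketch):

    // Loose structural type covering either Prisma client in this sketch.
    type Db = Record<string, {
      findMany(): Promise<any[]>;
      create(args: { data: object }): Promise<{ id: string }>;
    }>;

    // Phase 1: insert parents, recording old-id -> new-id mappings.
    // Phase 2: insert children, resolving references through the map.
    async function twoPhaseMigrate(mongoDb: Db, sqlDb: Db) {
      const idMap = new Map<string, string>();

      for (const a of await mongoDb.academy.findMany()) {
        const created = await sqlDb.academy.create({ data: { name: a.name } });
        idMap.set(a.id, created.id);
      }

      for (const c of await mongoDb.court.findMany()) {
        await sqlDb.court.create({
          data: { name: c.name, academyId: idMap.get(c.academyId)! },
        });
      }
    }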
- Data Loading:
  Write to your new SQL database using Prisma Client. Depending on the volume:
  - For smaller datasets, a straightforward ETL script might suffice.
  - For larger datasets, consider batching the operations and using transactions to maintain consistency (see the batching sketch after the example).

  Example (simplified):

    const { PrismaClient } = require('@prisma/client');

    // Note: in practice these must be two separately generated Prisma clients
    // (one per datasource provider); a single generated client cannot point at
    // both MongoDB and a SQL database.
    const prismaMongo = new PrismaClient({ datasources: { db: { url: process.env.MONGO_URL } } });
    const prismaSQL = new PrismaClient({ datasources: { db: { url: process.env.SQL_URL } } });

    async function migrateData() {
      // Extract data from MongoDB
      const mongoRecords = await prismaMongo.myCollection.findMany();

      // Transform and load into SQL
      for (const record of mongoRecords) {
        await prismaSQL.myCollection.create({
          data: {
            // Map and transform fields as needed
            name: record.name,
            createdAt: new Date(record.createdAt),
            // ...other fields...
          },
        });
      }
    }

    migrateData()
      .then(() => console.log('Migration completed'))
      .catch((e) => console.error(e))
      .finally(async () => {
        await prismaMongo.$disconnect();
        await prismaSQL.$disconnect();
      });
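  For the larger-dataset case, batching with createMany (supported on PostgreSQL) cuts round-trips; a sketch with an illustrative model name:

    // Insert rows in fixed-size chunks instead of one query per record.
    const BATCH_SIZE = 1000;

    async function loadInBatches(
      sqlDb: { myCollection: { createMany(args: { data: object[]; skipDuplicates?: boolean }): Promise<{ count: number }> } },
      rows: object[],
    ) {
      for (let i = 0; i < rows.length; i += BATCH_SIZE) {
        await sqlDb.myCollection.createMany({
          data: rows.slice(i, i + BATCH_SIZE),
          skipDuplicates: true, // keeps re-runs idempotent if the script restarts
        });
      }
    }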
- Testing the Migration:
  - Verify that all records have been transferred accurately.
  - Compare record counts and perform spot-checks on key data (a count-check sketch follows below).
  - Ensure relational integrity is maintained.
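A minimal count-check sketch for the verification step above (model names are illustrative; both arguments are Prisma clients):

// Compare row counts per model between the two stores; a mismatch is the
// cheapest first signal that the migration dropped or duplicated records.
type Countable = Record<string, { count(): Promise<number> }>;

async function compareCounts(mongoDb: Countable, sqlDb: Countable, models: string[]) {
  for (const model of models) {
    const [mongoCount, sqlCount] = await Promise.all([
      mongoDb[model].count(),
      sqlDb[model].count(),
    ]);
    console.log(`${model}: mongo=${mongoCount} sql=${sqlCount} ${mongoCount === sqlCount ? "OK" : "MISMATCH"}`);
  }
}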
5. Update Application Code and Deployment
- Switch the Database Connection:
  Change your application’s configuration to point to the new SQL database. Make sure all environment variables and Prisma client initializations reference the correct DATABASE_URL.

- Test the Application:
  Run end-to-end tests to ensure that your application works with the SQL backend. Check all CRUD operations, relations, and any SQL-specific features.

- Plan for Rollback:
  Keep the MongoDB instance running in parallel (if possible) until you are confident that the SQL database is fully operational. Have a rollback plan in case issues arise.
6. Monitor and Optimize
- Performance Monitoring:
  After migration, monitor database performance. SQL databases have different indexing and query optimization strategies than MongoDB.

- Optimize Indexes and Queries:
  Based on your application’s usage patterns, consider adjusting indexes and query structures to take full advantage of the relational database engine.

- Document Learnings:
  Record any challenges or adjustments made during the migration process to aid future migrations or troubleshooting.
Summary
This migration plan involves careful redesign of your data model, updating the Prisma schema, setting up a new SQL environment, and writing an ETL process for data migration. Testing at every stage is crucial to ensure data integrity and application stability. Once the migration is complete, updating your application configuration and monitoring performance will ensure a smooth transition from MongoDB to your new SQL-based system.
Feel free to reach out with any follow-up questions or for further details on specific parts of the migration process.